Illinois
Evaluating the Inductive Abilities of Large Language Models: Why Chain-of-Thought Reasoning Sometimes Hurts More Than Helps
Large Language Models (LLMs) have shown remarkable progress across domains, yet their ability to perform inductive reasoning--inferring latent rules from sparse examples--remains limited. It is often assumed that chain-of-thought (CoT) prompting, as used in Large Reasoning Models (LRMs), enhances such reasoning. We investigate this assumption by creating four controlled, diagnostic game-based tasks--chess, Texas Hold'em, dice games, and blackjack--with hidden human-defined rules. We find that CoT reasoning can degrade inductive performance, with LRMs often underperforming their non-reasoning counterparts. To explain this, we present a theoretical framework that reveals how reasoning steps can amplify error through three failure modes: incorrect sub-task decomposition, incorrect sub-task solving, and incorrect final answer summarization. Based on our theoretical and empirical analysis, we introduce structured interventions that adapt CoT generation according to our identified failure types. These interventions improve inductive accuracy without retraining. Our findings suggest that effective CoT reasoning depends not only on taking more steps but also on ensuring those steps are well-structured.
'I was taken from school and trained to fly UFOs with my mind,' claims child genius
A former gifted child has come forward with claims that he was removed from public school and secretly trained to develop psychic abilities for military and UFO-related applications.
Humanoid robot is spotted BEGGING on a street in China - claiming it has 'no money to recharge'
While many people worry that robots are coming to take their jobs, one unlucky bot seems to have fallen on hard times.
Topology-Aware Conformal Prediction for Stream Networks
Existing approaches either neglect dependencies, leading to overly conservative predictions, or rely solely on data-driven estimations, failing to capture the rich topological structure of the network. To address these challenges, we propose Spatio-Temporal Adaptive Conformal Inference (STACI), a novel framework that integrates network topology and temporal dynamics into the conformal prediction framework. STACI introduces a topology-aware nonconformity score that respects directional flow constraints and dynamically adjusts prediction sets to account for temporal distributional shifts. We provide theoretical guarantees on the validity of our approach and demonstrate its superior performance on both synthetic and real-world datasets. Our results show that STACI effectively balances prediction efficiency and coverage, outperforming existing conformal prediction methods for stream networks.
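To make the idea of dynamically adjusting prediction sets under temporal shift concrete, here is a minimal sketch of generic adaptive conformal inference on a single time series. It is not the STACI method: it uses a plain absolute-residual score rather than a topology-aware one, and the function names, the step size `gamma`, and the `warmup` length are all assumptions chosen for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Interval radius from past nonconformity scores at miscoverage level alpha.

    A radius of +inf encodes "cover everything"; -inf encodes the empty set.
    """
    if alpha <= 0:
        return math.inf
    if alpha >= 1:
        return -math.inf
    n = len(scores)
    # Finite-sample corrected rank of the (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def adaptive_conformal_intervals(preds, targets, alpha=0.1, gamma=0.05, warmup=20):
    """Online intervals whose working level alpha_t reacts to recent misses."""
    alpha_t = alpha
    # Seed the score pool with absolute residuals from a warm-up window.
    scores = [abs(y - p) for p, y in zip(preds[:warmup], targets[:warmup])]
    intervals, misses = [], []
    for p, y in zip(preds[warmup:], targets[warmup:]):
        q = conformal_quantile(scores, alpha_t)
        lo, hi = p - q, p + q
        err = 0 if lo <= y <= hi else 1
        intervals.append((lo, hi))
        misses.append(err)
        # A miss lowers alpha_t (wider intervals next step); a hit raises it.
        alpha_t += gamma * (alpha - err)
        scores.append(abs(y - p))
    return intervals, misses
```

Calling `adaptive_conformal_intervals(preds, targets)` returns one interval per post-warm-up step plus a 0/1 miss indicator. Because the update pushes `alpha_t` back whenever coverage drifts, the long-run miss rate stays close to `alpha` even if the residual distribution changes partway through the series.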
'We had to get out of the way': The backlash over delivery robots
The first time Chicago resident John Roberts saw a delivery robot trundling down the sidewalk on his street he was impressed. "I actually thought they were kind of neat - it felt futuristic," he says. But his attitude started to change when, soon after, he was out for a walk with his family. As another robot approached, they found themselves having to dodge it. "To us it felt a little off - the fact that we were on the one strip reserved for walking, and we were having to get out of the way," says Roberts.
RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes
Although COLMAP has long remained the predominant method for camera parameter optimization in static scenes, it is constrained by its lengthy runtime and reliance on ground truth (GT) motion masks for application to dynamic scenes. Many efforts have attempted to improve it by incorporating more priors as supervision, such as GT focal length, motion masks, 3D point clouds, camera poses, and metric depth, which, however, are typically unavailable in casually captured RGB videos. In this paper, we propose ROS-Cam, a novel method for more accurate and efficient camera parameter optimization in dynamic scenes, supervised solely by a single RGB video. Our method consists of three key components: (1) Patch-wise Tracking Filters, to establish robust and maximally sparse hinge-like relations across the RGB video.
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Estimating the uncertainty of responses from Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free Bayesianization (TFB), a simple yet theoretically grounded framework that efficiently transforms trained low-rank adapters into Bayesian ones without additional training. TFB systematically searches for the maximally acceptable level of variance in the weight posterior, constrained within a family of low-rank isotropic Gaussian distributions. Our theoretical analysis shows that under mild conditions, this search process is equivalent to KL-regularized variational optimization, a generalized form of variational inference. Through comprehensive experiments, we show that TFB achieves superior uncertainty estimation and generalization compared to existing methods while eliminating the need for complex Bayesianization training procedures.
A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning
Online reinforcement learning (RL) excels in complex, safety-critical domains but suffers from sample inefficiency, training instability, and limited interpretability. Data attribution provides a principled way to trace model behavior back to training samples, yet existing methods assume fixed datasets, an assumption that is violated in online RL, where each experience both updates the policy and shapes future data collection. In this paper, we initiate the study of data attribution for online RL, focusing on the widely used Proximal Policy Optimization (PPO) algorithm. We start by establishing a local attribution framework, interpreting model checkpoints with respect to the records in the recent training buffer. We design two target functions, capturing agent action and cumulative return respectively, and measure each record's contribution through gradient similarity between its training loss and these targets. We demonstrate the power of this framework through three concrete applications: diagnosis of learning, temporal analysis of behavior formation, and targeted intervention during training. Leveraging this framework, we further propose an algorithm, iterative influence-based filtering (IIF), for online RL training that iteratively performs experience filtering to refine policy updates. From standard RL benchmarks (classic control, navigation, locomotion) to RLHF for large language models, IIF reduces sample complexity, speeds up training, and achieves higher returns. Together, these results open a new direction for making online RL more interpretable, efficient, and effective.
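The gradient-similarity measurement described above can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the paper's implementation: it uses a two-action logistic policy, replaces PPO's clipped objective with the simpler surrogate loss `-advantage * log pi(a|s)`, and scores each buffer record by the dot product between the gradient of an "agent action" target and the descent direction of that record's loss. The names `influence_scores` and `filter_buffer` are hypothetical, and `filter_buffer` is a single illustrative filtering pass rather than the IIF algorithm itself.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_log_prob(theta, state, action):
    """Gradient of log pi(action | state) for a two-action logistic policy."""
    p1 = sigmoid(sum(w * x for w, x in zip(theta, state)))
    # For action in {0, 1}: d/dtheta log pi(a|s) = (a - p1) * s.
    return [(action - p1) * x for x in state]


def influence_scores(theta, buffer, target_state, target_action):
    """First-order effect of each record's update on log pi(target_action | target_state).

    Each buffer record is a (state, action, advantage) tuple.
    """
    g_target = grad_log_prob(theta, target_state, target_action)
    scores = []
    for state, action, advantage in buffer:
        # Surrogate loss: -advantage * log pi(a|s), so the gradient-descent
        # direction is +advantage * grad log pi(a|s).
        descent = [advantage * g for g in grad_log_prob(theta, state, action)]
        scores.append(sum(a * b for a, b in zip(g_target, descent)))
    return scores


def filter_buffer(buffer, scores):
    """Hypothetical filter: keep records that do not push the target down."""
    return [rec for rec, s in zip(buffer, scores) if s >= 0]
```

A positive score means a gradient step on that record would, to first order, raise the probability of the target action; a negative score means it would lower it. Swapping in a different target function (for example a return estimate) only changes how `g_target` is computed.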
AbsenceBench: Language Models Can't Tell What's Missing
Harvey Yiyun Fu (1), Aryan Shrivastava (1), Jared Moore (2), Peter West (2), Chenhao Tan (1), Ari Holtzman (1); (1) University of Chicago, (2) Stanford University
Large language models (LLMs) are increasingly capable of processing long inputs and locating specific information within them, as evidenced by their performance on the Needle in a Haystack (NIAH) test. However, while models excel at recalling surprising information, they still struggle to identify clearly omitted information. We introduce AbsenceBench to assess LLMs' capacity to detect missing information across three domains: numerical sequences, poetry, and GitHub pull requests. AbsenceBench asks models to identify which pieces of a document were deliberately removed, given access to both the original and edited contexts. Despite the apparent straightforwardness of these tasks, our experiments reveal that even state-of-the-art models like Claude-3.7-Sonnet